# Ay 119 - Some RDF exercises

Here we'll go through some examples of using Python and the rdflib library to read RDF documents from NeuroLex.org and make simple queries. NeuroLex.org is a freely editable semantic wiki for community-based curation of the terms used in neuroscience.

### A simple example

Firstly, we'll get an RDF document and count how many statements it contains.

In [4]:
import rdflib

# Get a Graph object
g = rdflib.Graph()

# Retrieve an RDF document from NeuroLex and parse it
result = g.parse("http://neurolex.org/wiki/Special:ExportRDF/birnlex_1489", format="application/rdf+xml")

print("Graph has %s statements" % len(g))

Graph has 483 statements


We can also iterate over all the statements in the graph:

In [10]:
for s, p, o in g:
    print("%s %s %s" % (s, p, o))

http://neurolex.org/wiki/Category-3AResource-3AFunctional_Anatomy_of_the_Cerebro-2DCerebellar_System_(FACCS) http://neurolex.org/wiki/Property-3AKeywords http://neurolex.org/wiki/Birnlex_1489
http://neurolex.org/wiki/Property-3AAuthors http://www.w3.org/2000/01/rdf-schema#isDefinedBy http://neurolex.org/wiki/Special:ExportRDF/Property:Authors
http://neurolex.org/wiki/Category-3AResource-3ASpatially_unbiased_atlas_template_of_the_cerebellum_and_brainstem http://www.w3.org/2000/01/rdf-schema#label Resource:Spatially unbiased atlas template of the cerebellum and brainstem
http://neurolex.org/wiki/Property-3AAbout http://www.w3.org/2000/01/rdf-schema#isDefinedBy http://neurolex.org/wiki/Special:ExportRDF/Property:About
http://neurolex.org/wiki/Category-3AInfracerebellar_nucleus http://www.w3.org/1999/02/22-rdf-syntax-ns#type http://www.w3.org/2002/07/owl#Class
http://neurolex.org/wiki/Category-3ARexed_lamina_VII http://www.w3.org/1999/02/22-rdf-syntax-ns#type http://www.w3.org/2002/07/owl#

### Querying the graph

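When working with big documents (RDF stores), we want to query rather than iterate. So we'll use SPARQL to find one specific statement, the term's definition, and print it. The `property:` prefix is bound to the NeuroLex property namespace through `initNs` in the code that follows. The SPARQL query is: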

    SELECT DISTINCT ?b
    WHERE {
        ?a property:Definition ?b .
    }


In [52]:
import rdflib

# Get the graph
g = rdflib.Graph()
result = g.parse("http://neurolex.org/wiki/Special:ExportRDF/birnlex_1489", format="application/rdf+xml")

# Query the graph
qres = g.query("""SELECT DISTINCT ?b WHERE {?a property:Definition ?b .}""",
               initNs=dict(
                   property=rdflib.Namespace("http://neurolex.org/wiki/Property-3A"),
                   wiki=rdflib.Namespace("http://neurolex.org/wiki/")))

print("Definition: %s" % qres.bindings[0]['b'])

Definition: Part of the rhombencephalon that lies in the posterior cranial fossa behind the brain stem, consisting of the cerebellar cortex, deep cerebellar nuclei and cerebellar white matter.
A portion of the brain that helps regulate posture, balance, and coordination. (NIDA Media Guide Glossary)

The dorsal topographic division of the hindbrain, connected to the ventral division-the pons-by a white matter tract, the middle cerebellar peduncle. The cerebellum was discovered and named by Aristotle (De Partibus Animalium) based on macrodissection of a variety of animals not including humans; see translation by Thompson (1910, 494b 30). Older synonyms include parencephalon (Aristotle), hindbrain (Galen, c192).


Now let's add a second term to the query:

    SELECT DISTINCT ?label ?def
    WHERE {
        ?a property:Label ?label .
        ?a property:Definition ?def .
    }

In [51]:
import rdflib

# Get the graph
g = rdflib.Graph()
result = g.parse("http://neurolex.org/wiki/Special:ExportRDF/birnlex_1489", format="application/rdf+xml")

# Query the graph
qres = g.query("""SELECT DISTINCT ?label ?def
                  WHERE {?a property:Label ?label . ?a property:Definition ?def .}""",
               initNs=dict(
                   property=rdflib.Namespace("http://neurolex.org/wiki/Property-3A"),
                   wiki=rdflib.Namespace("http://neurolex.org/wiki/")))

print("Label: %s" % qres.bindings[0]['label'].value)
print("Definition: %s" % qres.bindings[0]['def'].value)

Label: Cerebellum
Definition: Part of the rhombencephalon that lies in the posterior cranial fossa behind the brain stem, consisting of the cerebellar cortex, deep cerebellar nuclei and cerebellar white matter.
A portion of the brain that helps regulate posture, balance, and coordination. (NIDA Media Guide Glossary)

The dorsal topographic division of the hindbrain, connected to the ventral division-the pons-by a white matter tract, the middle cerebellar peduncle. The cerebellum was discovered and named by Aristotle (De Partibus Animalium) based on macrodissection of a variety of animals not including humans; see translation by Thompson (1910, 494b 30). Older synonyms include parencephalon (Aristotle), hindbrain (Galen, c192).


### Looping look-up

Now let's take a list of NeuroLex ids and look up the label and definition of each:

sao185580330
PATO_0001463
nlx_15593
nlx_147643
sao1394521419
birnlex_826
birnlex_1565
birnlex_1197
birnlex_1373
sao1211023249
birnlex_147
birnlex_1508
nlx_inv_090914
sao1744435799
nlx_414
birnlex_12500
CHEBI:15765
birnlex_7004
GO:0040011
PATO_0000051
sao1417703748
birnlex_727
birnlex_2339
birnlex_2098
nlx_subcell_20090508
nlx_subcell_20090511
birnlex_1298
birnlex_1672
sao914572699
sao1071221672
birnlex_2337
birnlex_954
sao221389602
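
One way to turn a plain-text list like the one above into a Python list, and to see the export URL each id maps to, is sketched here. Only four of the ids are included to keep it short; the URL pattern is the one used in the earlier cells, and the variable names are illustrative.

```python
# A few ids from the list above, pasted as plain text
raw_ids = """
sao185580330
PATO_0001463
nlx_15593
CHEBI:15765
"""

# One id per non-empty line, with surrounding whitespace removed
ids = [line.strip() for line in raw_ids.splitlines() if line.strip()]

# Same export URL pattern as in the earlier cells
EXPORT_URL = "http://neurolex.org/wiki/Special:ExportRDF/%s"
urls = [EXPORT_URL % term_id for term_id in ids]

for url in urls:
    print(url)
```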

In [55]:
import rdflib

ids = ['sao185580330', 'PATO_0001463', 'nlx_15593', 'nlx_147643', 'sao1394521419', 'birnlex_826', 
 'birnlex_1565', 'birnlex_1197', 'birnlex_1373', 'sao1211023249', 'birnlex_147', 
 'birnlex_1508', 'nlx_inv_090914', 'sao1744435799', 'nlx_414', 'birnlex_12500', 
 'CHEBI:15765', 'birnlex_7004', 'GO:0040011', 'PATO_0000051', 'sao1417703748', 
 'birnlex_727', 'birnlex_2339', 'birnlex_2098', 'nlx_subcell_20090508', 
 'nlx_subcell_20090511', 'birnlex_1298', 'birnlex_1672', 'sao914572699', 'sao1071221672', 
 'birnlex_2337', 'birnlex_954', 'sao221389602']

# Loop over the ids
for term_id in ids:
    # Get the graph
    g = rdflib.Graph()
    result = g.parse("http://neurolex.org/wiki/Special:ExportRDF/%s" % term_id,
                     format="application/rdf+xml")

    # Query the graph
    qres = g.query("""SELECT DISTINCT ?label ?def
                      WHERE {?a property:Label ?label . ?a property:Definition ?def .}""",
                   initNs=dict(
                       property=rdflib.Namespace("http://neurolex.org/wiki/Property-3A"),
                       wiki=rdflib.Namespace("http://neurolex.org/wiki/")))

    try:
        print("Label: %s" % qres.bindings[0]['label'].value)
        print("Definition: %s" % qres.bindings[0]['def'].value)
    except IndexError:
        # No label/definition pair was found: print the id instead
        print(term_id)
    print("")
 

Label: Acetylcholine
Definition: A neurotransmitter. Acetylcholine in vertebrates is the major transmitter at neuromuscular junctions, autonomic ganglia, parasympathetic effector junctions, a subset of sympathetic effector junctions, and at many sites in the central nervous system. It is generally not used as an administered drug because it is broken down very rapidly by cholinesterases, but it is useful in some ophthalmological applications (MSH).

Label: Action potential
Definition: A large, brief, all-or-nothing, regenerative electrical potential, that propagates along the axon, muscle fiber, or some dendrites. Action potentials are usually generated at the axon hillock and propagate uni-directionally down the axon. Adapted from Nicholls, Martin and Wallace 3rd edition.

Label: Afferent
Definition: An axon that is incoming into a brain region. Could also refer to the neuron that gives rise to the axon.

(A connection, or a pathway to a node. A node can be an individual neuron, a neu